Semi-automatic Building of Swedish Collocation Lexicon

Authors

  • Silvie Cinková
  • Pavel Pecina
  • Petr Podveský
  • Pavel Schlesinger
Abstract

This work focuses on the semi-automatic extraction of verb-noun collocations from a corpus, performed to provide lexical evidence for the manual lexicographical processing of Support Verb Constructions (SVCs) in the Swedish-Czech Combinatorial Valency Lexicon of Predicate Nouns. The efficiency of a purely manual extraction procedure is significantly improved by using automatic statistical methods based on lexical association measures.
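As a rough illustration of what a lexical association measure does, the sketch below ranks verb-noun pairs by pointwise mutual information (PMI). The abstract does not say which measures the authors used, so PMI is an assumption chosen for brevity, and the function name `pmi_scores` and the toy pair counts are invented for this example.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Score each (verb, noun) pair by pointwise mutual information.

    `pairs` is a list of (verb, noun) co-occurrences, one entry per
    occurrence in the corpus. PMI is only one of many possible
    association measures and is used here purely for illustration.
    """
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    noun_counts = Counter(noun for _, noun in pairs)
    total = len(pairs)

    scores = {}
    for (verb, noun), joint in pair_counts.items():
        # PMI = log2( P(verb, noun) / (P(verb) * P(noun)) )
        scores[(verb, noun)] = math.log2(
            joint * total / (verb_counts[verb] * noun_counts[noun])
        )
    return scores


# Invented toy counts, not data from the paper.
pairs = [("fatta", "beslut")] * 3 + [("ta", "beslut")] + [("ta", "bok")] * 4

scores = pmi_scores(pairs)
# Highest-scoring pairs come first; a lexicographer would review the top of the list.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for (verb, noun), score in ranked:
    print(f"{verb} {noun}: {score:.2f}")
```

In a semi-automatic workflow of the kind the abstract describes, such a ranked list would serve only as a candidate list for manual review. Note that PMI tends to overrate rare pairs, so real systems typically combine or compare several measures.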

Similar resources

Automatic Detection of Collocation

Collocation is a very important relation between words, which can be widely applied to semantic parsing (e.g., word sense disambiguation), machine translation (e.g., automatic alignment of bilingual corpora), computational lexicons, etc. First, we summarized the theory behind the likelihood interval, likelihood ratio test, u test and χ² test for collocation, and then utilized them to extr...

Developing and Evaluating a Searchable Swedish-Thai Lexicon

We present an automatically created Swedish-Thai lexicon. The lexicon was created by matching the English translations in a Thai-English and a Swedish-English lexicon. The search interface to the lexicon includes several NLP tools to help the target group: second language learners of Swedish. These include automatic generation of inflectional forms of words, automatic spelling correction, lemma...

Information Retrieval with Language Knowledge

The introduction of Swedish made it possible for Lexware to be tested for the first time in CLEF. Lexware is a natural language system applied to an information retrieval task, not an information retrieval system using NLP techniques; it is therefore interesting to compare its results with those of other, less unusual IR systems. We experience that separate evaluation of document description and query bu...

MedLex+: An Integrated Corpus-Lexicon Medical Workbench for Swedish

This paper reports on the work carried out in developing MedLex+, a medical corpus-lexicon workbench for Swedish. This project, which is still under active development, has been going on for some years now within the Department of Swedish Language at Göteborg University. At the moment, the workbench incorporates: an annotated collection of medical texts, including 20 million tokens and 45,000 docume...

Building a MWE Lexicon for Swedish (SweMWELex)

This paper describes a pilot study in building a Swedish multiword expression lexicon (SweMWELex) containing data collected from a corpus. Parts of the Swedish Treebank will be used for extracting the MWEs, and the result will be stored in a publicly available XML framework called the Lexical Markup Framework (LMF). The result gives a prototype lexicon which will be a good starting point t...

Publication date: 2006